Towards Automatic Intoxication Detection from Speech in Real-Life Acoustic Environments
Abstract
In-car intoxication detection from speech is a promising non-intrusive method for reducing the accident risk associated with drunk driving. However, in-car noise significantly degrades recognition performance and must be addressed in practical applications. In this paper, we investigate how severely intrinsic in-car noise and background music affect the accuracy of intoxication recognition. From extensive test runs using the official speech corpus of the INTERSPEECH 2011 Intoxication Challenge, realistic car noise, and original popular music, we conclude that stationary driving noise as well as music cause a significant degradation when acoustic models are trained on clean speech only, which can be partly alleviated by multi-condition training. In addition, exploiting cumulative evidence over time through late decision fusion appears to be a promising way to further improve performance in noisy conditions.
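The two ideas named in the abstract, noisy training data for multi-condition training and late decision fusion over time, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mix_at_snr` and `cumulative_late_fusion`, the running-mean fusion rule, and the 0.5 threshold are all assumptions chosen for illustration, and signals are plain lists of samples.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB.

    Hypothetical helper: one common way to create noisy copies of clean
    utterances for multi-condition training.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise is silent")
    # Choose gain g so that p_speech / (g^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def cumulative_late_fusion(posteriors, threshold=0.5):
    """Fuse per-utterance posteriors by their running mean (assumed rule).

    Returns the decision (True = intoxicated) available after each utterance.
    """
    decisions = []
    total = 0.0
    for count, p in enumerate(posteriors, start=1):
        total += p
        decisions.append(total / count >= threshold)
    return decisions
```

Under these assumptions, a training set would contain each clean utterance plus copies produced by `mix_at_snr` at several SNR levels, and `cumulative_late_fusion` would smooth out single noisy decisions as more utterances from the same driver arrive.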
Similar resources
Towards Natural Acoustic Interfaces for Automatic Speech Recognition
Aiming at 'natural' hands-free acoustic human/machine interfaces, the need for corresponding distant-talking automatic speech recognition (ASR) systems increases and presents us with major signal processing challenges at the acoustic front-end. Considering interactive TV as a challenging exemplary application scenario, we investigate the structural problems presented by noisy and reverberant multi-...
Face reading from speech - predicting facial action units from audio cues
The automatic recognition of facial behaviours is usually achieved through the detection of particular FACS Action Units (AUs), which then makes it possible to analyse the affective behaviours expressed in the face. Despite the fact that advanced techniques have been proposed to extract relevant facial descriptors, the processing of real-life data, i.e., recorded in unconstrained environments, m...
Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification
Traditional studies of speaker state focus primarily upon one-stage classification techniques using standard acoustic features. In this article, we investigate multiple novel features and approaches to two recent tasks in speaker state detection: level-of-interest (LOI) detection and intoxication detection. In the task of LOI prediction, we propose a novel Discriminative TFIDF feature to captur...
Detecting Intoxication in Speech
Researchers at Columbia are investigating ways to automatically detect intoxication in speech. William Yang Wang, currently a PhD student at Carnegie Mellon who worked on this team while a Master's student, discussed the project and its goals with us. OVERVIEW Imagine a world where DUIs (driving under the influence violations) never occurred. How can this happen? Traditionally devices like br...
A Robust, Real-time Endpoint Detector with Energy Normalization for ASR in Adverse Environments
When automatic speech recognition (ASR) is applied to hands-free or other adverse acoustic environments, endpoint detection and energy normalization can be crucial to the entire system. In low signal-to-noise ratio (SNR) situations, conventional approaches to endpointing and energy normalization often fail, and ASR performance usually degrades dramatically. The goal of this paper is to find a fast, acc...